Image representation method and apparatus

ABSTRACT

A colour image comprises colour values in each of one or more colour channels for each of a plurality of points, or pixels, within the image. The image is represented by rank ordering the values in the or each colour channel. The image representation generated in this way is usable for automated-vision or computer-vision tasks, for example.

Colour can potentially provide useful information to a variety of computer vision tasks such as image segmentation, image retrieval, object recognition and tracking. However, for it to be helpful in practice, colour must relate directly to the intrinsic properties of the imaged objects and be independent of imaging conditions such as scene illumination and the imaging device. To this end many invariant colour representations have been proposed in the literature. Unfortunately, recent work [3] has shown that none of them provides good enough practical performance.

SUMMARY OF THE INVENTION

The invention provides a method and an apparatus for representing, characterising, transforming or modifying a colour image as defined in the appended independent claims. Preferred or advantageous features of the invention are set out in dependent subclaims.

In this paper we propose a new colour invariant image representation based on an existing grey-scale image enhancement technique: histogram equalisation. We show that, provided the rank ordering of sensor responses is preserved across a change in imaging conditions (lighting or device), a histogram equalisation of each channel of a colour image renders it invariant to these conditions. We set out theoretical conditions under which rank ordering of sensor responses is preserved and we present empirical evidence which demonstrates that rank ordering is maintained in practice for a wide range of illuminants and imaging devices. Finally we apply the method to an image indexing application and show that the method outperforms all previous invariant representations, giving close to perfect illumination invariance and very good performance across a change in device.

1 INTRODUCTION

It has long been argued that colour (or RGB) images provide useful information which can help in solving a wide range of computer vision problems. For example it has been demonstrated [14, 1] that characterising an image by the distribution of its colours (RGBs) is an effective way to locate images with similar content from amongst a diverse database of images, and that a similar approach [19] can be used to locate objects in an image. Colour has also been found to be useful for tasks such as image segmentation [12, 13] and object tracking [7, 16]. Implicit in these applications of colour is the assumption that the colours recorded by devices are an inherent property of the imaged objects and thus a reliable cue to their identity. Unfortunately a careful examination of image formation reveals that this assumption is not valid. The RGB that a camera records is more properly a measure of the light reflected from the surface of an object and while this does depend in part on characteristics of the object, it depends in equal measure on the composition of the light which is incident on the object in the first place. So, an object that is lit by an illuminant which is itself reddish will be recorded by a camera as more red than will the same object lit under a more bluish illuminant. That is, image RGBs are illumination dependent. In addition image colour also depends on the properties of the recording device. Importantly, different imaging devices have different sensors, which implies that an object that produces a given RGB response in one camera might well produce a quite different response in a different device.

In recognition of this fact many researchers have sought modified image representations such that one or more of these dependencies are removed. Researchers have, to date, concentrated on accounting for illumination dependence and typically adopt one of two approaches: colour invariant [9, 5, 6, 10, 18] or colour constancy [11, 8] methods. Colour invariant approaches seek transformations of the image data such that the transformed data is illuminant independent, whereas colour constancy approaches set out to determine an estimate of the light illuminating a scene and provide this estimate in some form to subsequent vision algorithms.

Colour constancy algorithms, in contrast to invariant approaches, can deliver true object colours. Moreover, colour invariants can be calculated post-colour constancy processing but the converse is not true. This said, colour constancy has proven to be a harder problem to solve than colour invariants. Most importantly however, it has been demonstrated [8, 3] that the practical performance of neither approach is good enough to facilitate colour-based object recognition or image retrieval. Moreover, none of the methods even attempts to account for device dependence.

In this paper we seek to address the limitations of these existing approaches by defining a new image representation which we show is both illumination independent and (in many cases) also device independent. Our method is based on the observation that while a change in illumination or device leads in practice to significant changes in the recorded RGBs, the rank orderings of the responses of a given sensor are largely preserved. In fact, we show in this paper (§3) that under certain simplifying assumptions, invariance of rank ordering follows directly from the image formation equation. In addition we present an empirical study (§4) which reveals that the preservation of rank ordering holds in practice both across a wide range of illuminants and a variety of imaging devices. Thus, an image representation which is based on rank ordering of recorded RGBs rather than on the RGBs themselves offers the possibility of accounting for both illumination and device dependence at the same time.

To derive an image representation which depends only on rank orderings we borrow a tool which has long been used by the image processing community for a quite different purpose. The technique is histogram equalisation and is typically applied to grey-scale images to produce a new image which is enhanced in the sense that the image has more contrast and thus conveys more information. In some cases this results in a visually more pleasing image. But in a departure from traditional image processing practice, we apply the procedure not to a grey-scale image, but rather to each of the R, G, and B channels of a colour image independently of one another. We show that, provided two images differ in such a way as to preserve the rank ordering of pixel values in each of the three channels, an application of histogram equalisation to each of the channels of the two images results in a pair of equivalent images. Thus, provided a change in illuminant or device preserves rank ordering of pixel responses, the application of histogram equalisation will provide us with an invariant representation of a scene which might subsequently be of use in a range of vision applications.

Of course the reader may be surprised that we propose something so simple: histogram equalisation is a common tool. Paradoxically however, histogram equalising the R, G, and B channels of an image is generally discouraged because this results in unnatural pseudo-colours. For our purposes (recognition or tracking), however, pseudo-colours suffice. We demonstrate this (§5) by applying the method to the problem of colour indexing: we show that the method outperforms all previous approaches and in the case of a change in illumination provides close to perfect indexing.

2 BACKGROUND

We base our work on a simple model of image formation in which the response of an imaging device to an object depends on three factors: the light by which the object is lit, the surface reflectance properties of the object, and the properties of the device's sensors. We assume that a scene is illuminated by a single light characterised by its spectral power distribution, which we denote $E(\lambda)$ and which specifies how much energy the source emits at each wavelength ($\lambda$) of the electromagnetic spectrum. The reflectance properties of a surface are characterised by a function $S(\lambda)$ which defines what proportion of the light incident upon it the surface reflects on a per-wavelength basis. Finally a sensor is characterised by $R_{k}(\lambda)$, its spectral sensitivity function, which specifies its sensitivity to light energy at each wavelength of the spectrum. The subscript $k$ denotes that this is the kth sensor. Its response is defined as:

$\begin{matrix}{p_{k} = \int_{\omega} E(\lambda)\, S(\lambda)\, R_{k}(\lambda)\, d\lambda, \quad k = 1, \ldots, m} & (1)\end{matrix}$

where the integral is taken over the range of wavelengths $\omega$: the range for which the sensor has non-zero sensitivity. In what follows we assume that our devices (as most devices do) have three sensors ($m = 3$) so that the response of a device to a point in a scene is represented by a triplet of values: $(p_{1}, p_{2}, p_{3})$. It is common to denote these triplets as R, G, and B or just RGBs and so we use the different notations interchangeably throughout. An image is thus a collection of RGBs representing the device's response to light from a range of positions in a scene.
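Equation (1) can be approximated numerically by sampling the three spectra at regular wavelength steps and summing. The following is a minimal sketch under that assumption; the function name `sensor_response` and all spectral values are hypothetical and chosen only for illustration.

```python
def sensor_response(illuminant, reflectance, sensitivity, step_nm):
    """Approximate Equation (1) by a Riemann sum over sampled wavelengths.

    The three lists hold E, S and R_k sampled at the same, evenly spaced
    wavelengths; step_nm is the spacing between samples.
    """
    return step_nm * sum(
        e * s * r for e, s, r in zip(illuminant, reflectance, sensitivity)
    )


# Hypothetical four-sample spectra, ordered from short to long wavelengths.
surface = [0.2, 0.4, 0.6, 0.8]            # S: reflects more at long wavelengths
long_wave_sensor = [0.0, 0.1, 0.6, 1.0]   # R_k: a "red" sensor
bluish_light = [1.4, 1.1, 0.9, 0.6]       # E under a bluish illuminant
reddish_light = [0.6, 0.9, 1.1, 1.4]      # E under a reddish illuminant

p_bluish = sensor_response(bluish_light, surface, long_wave_sensor, 10.0)
p_reddish = sensor_response(reddish_light, surface, long_wave_sensor, 10.0)
```

With these made-up spectra the same surface gives a larger long-wavelength response under the reddish light than under the bluish one, which is the illumination dependence described in the text.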

Equation (1) makes clear the fact that a device response depends both on properties of the sensor (it depends on $R_{k}(\lambda)$) and also on the prevailing illumination ($E(\lambda)$). That is, responses are both device and illumination dependent. It follows that if no account is taken of these dependencies, an RGB cannot correctly be considered an intrinsic property of an object, and employing it as such is quite likely to lead to poor results.

An examination of the literature reveals many attempts to deal with the illumination dependence problem. One approach is to apply a correction to the responses recorded by a device to account for the colour of the prevailing scene illumination. Provided an accurate estimate of the scene illumination can be obtained, such a correction accounts well for the illumination dependence, rendering responses colour constant: that is, stable across a change in illumination. The difficulty with this approach is the fact that estimating the scene illuminant is non-trivial. In 1998 Funt et al. [8] demonstrated that existing colour constancy algorithms are not sufficiently accurate to make such an approach viable. More recent work [11] has shown that for simple imaging conditions and given good device calibration the colour constancy approach can work.

A different approach is to derive from the image data some new representation of the image which is invariant to illumination. Such approaches are classified as colour (or illuminant) invariant approaches and a wide variety of invariant features have been proposed in the literature. Accounting for a change in illumination colour is however difficult because, as is clear from Equation (1), the interaction between light, surface and sensor is complex. Researchers have attempted to reduce the complexity of the problem by adopting simple models of illumination change. One of the simplest models is the so-called diagonal model in which it is proposed that sensor responses under a pair of illuminants are related by a diagonal matrix transform:

$\begin{matrix}{\begin{pmatrix}R^{c} \\ G^{c} \\ B^{c}\end{pmatrix} = \begin{pmatrix}\alpha & 0 & 0 \\ 0 & \beta & 0 \\ 0 & 0 & \gamma\end{pmatrix}\begin{pmatrix}R^{o} \\ G^{o} \\ B^{o}\end{pmatrix}} & (2)\end{matrix}$

where the superscripts $o$ and $c$ characterise the pair of illuminants. The model is widely used, and has been shown to be well justified under many conditions [7]. Adopting such a model, one simple illuminant invariant representation of an image can be derived by applying the following transform:

$\begin{matrix}{R^{\prime} = \frac{R}{R_{ave}}, \quad G^{\prime} = \frac{G}{G_{ave}}, \quad B^{\prime} = \frac{B}{B_{ave}}} & (3)\end{matrix}$

where the triplet $(R_{ave}, G_{ave}, B_{ave})$ denotes the mean of all RGBs in an image. It is easy to show that this so-called Greyworld representation of an image is illumination invariant provided that Equation (2) holds: scaling a channel by $\alpha$ scales its mean by $\alpha$ too, so the factor cancels in the ratio.
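The Greyworld normalisation of Equation (3) and its invariance under the diagonal model of Equation (2) can be checked on a toy image. This is a minimal sketch only: the function names, the list-of-tuples image layout and the pixel values are illustrative assumptions, and a channel whose mean is zero is not handled.

```python
def greyworld(pixels):
    """Divide each channel by its mean over the image (Equation (3)).

    `pixels` is a list of (R, G, B) tuples; every channel mean is assumed
    to be non-zero.
    """
    n = len(pixels)
    means = [sum(p[k] for p in pixels) / n for k in range(3)]
    return [tuple(p[k] / means[k] for k in range(3)) for p in pixels]


def apply_diagonal(pixels, alpha, beta, gamma):
    """Simulate an illuminant change under the diagonal model (Equation (2))."""
    return [(alpha * r, beta * g, gamma * b) for r, g, b in pixels]


# Hypothetical three-pixel image under illuminant o, then under illuminant c.
image_o = [(10.0, 20.0, 30.0), (40.0, 10.0, 5.0), (25.0, 60.0, 15.0)]
image_c = apply_diagonal(image_o, 1.5, 0.5, 2.0)

norm_o = greyworld(image_o)
norm_c = greyworld(image_c)

# The two normalised images agree up to floating-point rounding.
max_difference = max(
    abs(a - b) for po, pc in zip(norm_o, norm_c) for a, b in zip(po, pc)
)
```

The check only holds when the change between the two images really is a per-channel scaling; any non-linear transform of the responses breaks it.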

Many other illuminant invariant representations have been derived, in some cases [10] by adopting different models of image formation. All derived invariants however share two common failings. First, it has been demonstrated that when applied to the practical problem of image retrieval none of these invariants affords good enough performance across a change in illumination. Secondly, none of these approaches considers the issue of device invariance.

Device dependence occurs because different devices have different spectral sensitivity functions (different $R_{k}$ in Equation (1)) but also because the colours recorded by a device are often not linearly related to scene radiance as Equation (1) suggests, but rather are some non-linear transform of this:

$\begin{matrix}{p_{k} = f\left(\int_{\omega} E(\lambda)\, S(\lambda)\, R_{k}(\lambda)\, d\lambda\right), \quad k = 1, \ldots, m} & (4)\end{matrix}$

The transform f( ) is deliberately applied to RGB values recorded by a device for a number of reasons. First, many captured images will eventually be displayed on a monitor. Importantly, colours displayed on a screen are not a linear function of the RGBs sent to the monitor. Rather, there exists a power function relationship between the incoming voltage and the displayed intensity. This relationship is known as the gamma of the monitor, where gamma describes the exponent of the power function [15]. To compensate for this gamma function, images are usually stored in a way that reverses the effect of this transformation: that is, by applying a power function with exponent 1/γ, where γ describes the gamma of the monitor, to the image RGBs. Importantly, monitor gammas are not unique but can vary from system to system and so images from two different devices will not necessarily have the same gamma correction applied. In addition to gamma correction, other more general non-linear "tone curve" corrections are often applied to images so as to change image contrast with the intention of creating a visually more pleasing image. Such transformations are device dependent and, quite often, image dependent, and so lead inevitably to device-dependent colour. In the next section we address the limitations of existing invariant approaches by introducing a new invariant representation.

3 HISTOGRAM EQUALISATION FOR COLOUR INVARIANCE

Let us begin by considering again the diagonal model of image formation defined by Equation (2). We observe that one implication of this model of illumination change is that the rank ordering of sensor responses is preserved under a change of illumination. To see this, consider the responses of a single sensor R, such that $R_{i}^{o}$ represents the response to a surface $i$ under an illuminant $o$. Under a second illuminant, which we denote $c$, the surface will have response $R_{i}^{c}$ and the pair of sensor responses are related by:

$\begin{matrix}{R_{i}^{c} = \alpha R_{i}^{o}} & (5)\end{matrix}$

Equation (5) is true for all surfaces (that is, $\forall\, i$). Now, consider a pair of surfaces, $i$ and $j$, viewed under illuminant $o$ and suppose that $R_{i}^{o} > R_{j}^{o}$. Then it follows from Equation (5) that:

$\begin{matrix}{R_{i}^{o} > R_{j}^{o} \;\Rightarrow\; \alpha R_{i}^{o} > \alpha R_{j}^{o} \;\Rightarrow\; R_{i}^{c} > R_{j}^{c} \quad \forall\, i, j, \; \forall\, \alpha > 0} & (6)\end{matrix}$

That is, the rank ordering of sensor responses within a given channel is invariant to a change in illumination.
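The argument above can be checked numerically for a single sensor. This is a minimal sketch, assuming a positive scaling factor; the helper `rank_order` and the response values are hypothetical.

```python
from bisect import bisect_left


def rank_order(values):
    """Rank of each element (0 = smallest); tied values share the lowest rank."""
    ordered = sorted(values)
    return [bisect_left(ordered, v) for v in values]


responses_o = [0.2, 0.9, 0.5, 0.1]                # one sensor, four surfaces, illuminant o
alpha = 1.7                                       # hypothetical positive scaling factor
responses_c = [alpha * r for r in responses_o]    # Equation (5)

same_ranks = rank_order(responses_o) == rank_order(responses_c)
```

The requirement that the factor be positive matters: a negative factor would reverse the ordering, although a physical illuminant change never produces one.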

Thus, if what we seek is an image representation which is invariant to illumination we can obtain one by considering not the pixel values themselves but rather the relative ordering of these values. There are a number of ways we might employ this rank ordering information to derive an invariant representation; we set forth one such method here which we will demonstrate has a number of interesting properties. To understand our method consider a single channel of an RGB image recorded under an illuminant $o$ where, without loss of generality, we restrict the range of $R^{o}$ to some finite interval: $R^{o} \in [0, R_{\max}]$. Now, consider further a value $R_{i}^{o} \in [0, R_{\max}]$ where $R_{i}^{o}$ is not necessarily the value of any pixel in the image. Let us denote by $P(R^{o} \leq R_{i}^{o})$ the number of pixels in the image with a value less than or equal to $R_{i}^{o}$. Under a second illuminant, $c$, a pixel value $R^{o}$ under illuminant $o$ is mapped to a corresponding value $R^{c}$. We denote by $P(R^{c} \leq R_{i}^{c})$ the number of pixels in the second image whose value is less than or equal to $R_{i}^{c}$. Assuming that the illumination change preserves rank ordering of pixels we have the following relation:

$\begin{matrix}{P(R^{c} \leq R_{i}^{c}) = P(R^{o} \leq R_{i}^{o})} & (7)\end{matrix}$

That is, the number of pixels in our image under illuminant $o$ which have a value less than or equal to $R_{i}^{o}$ is equal to the number of pixels in the image under illuminant $c$ which have a value less than or equal to the transformed pixel value $R_{i}^{c}$: a change in illumination preserves cumulative proportions. Given this, we define one channel of the invariant image representation thus:

$\begin{matrix}{R_{i}^{inv} = \frac{R_{\max}}{N_{pix}} P(R^{o} \leq R_{i}^{o}) = \frac{R_{\max}}{N_{pix}} P(R^{c} \leq R_{i}^{c})} & (8)\end{matrix}$

where $N_{pix}$ is the number of pixels and the constant $R_{\max}/N_{pix}$ ensures that the invariant image has the same range of values as the input image. Repeating the procedure for each channel of a colour image results in the required invariant image.
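Equation (8) translates directly into a counting procedure. The following is a minimal sketch, not a definitive implementation: the function names, the default $R_{\max}$ of 255 and the list-based image layout are assumptions made for illustration.

```python
from bisect import bisect_right


def equalise_channel(channel, r_max=255.0):
    """Map each value to (r_max / N_pix) * #{pixels <= value} (Equation (8))."""
    ordered = sorted(channel)
    n_pix = len(channel)
    # bisect_right counts the pixels whose value is <= v.
    return [r_max * bisect_right(ordered, v) / n_pix for v in channel]


def equalise_image(pixels, r_max=255.0):
    """Equalise the R, G and B channels independently of one another."""
    channels = [equalise_channel([p[k] for p in pixels], r_max) for k in range(3)]
    return list(zip(*channels))


# Hypothetical single channel under illuminant o, and the same channel after
# a rank-preserving change (here a scaling by 0.6) under illuminant c.
channel_o = [12.0, 200.0, 45.0, 45.0, 90.0]
channel_c = [0.6 * v for v in channel_o]

invariant_o = equalise_channel(channel_o)
invariant_c = equalise_channel(channel_c)   # identical to invariant_o
```

Because the output depends only on how many pixels lie at or below each value, any change that preserves the ordering (and the ties) of the pixel values leaves the result unchanged.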

The reader familiar with the image processing literature might recognise Equation (8). Indeed this transformation of image data is one of the simplest and most widely used methods for image enhancement and is commonly known as histogram equalisation. Histogram equalisation is an image enhancement technique originally developed for a single channel, or grey-scale, image. The aim is to increase the overall contrast in the image since doing so typically brightens dark areas of an image, increasing the detail in those regions, which in turn can sometimes result in a more pleasing image. Histogram equalisation achieves this aim by transforming an image such that the histogram of the transformed image is as close as possible to a uniform histogram. The approach is justified on the grounds that amongst all possible histograms, a uniformly distributed histogram has maximum entropy. Maximising the entropy of a distribution maximises its information and thus histogram equalising an image maximises the information content of the output image. Accepting the theory, to histogram equalise an image we must transform the image such that the resulting image histogram is uniform. Now, suppose that $x_{i}$ represents a pixel value in the original image and $x_{i}^{\prime}$ its corresponding value in the transformed image. We would like to transform the original image such that the proportion of pixels less than $x_{i}^{\prime}$ in the transformed image is equal to the proportion of image pixels less than $x_{i}$ in the original image, and that moreover the histogram of the output image is uniform. This implies:

$\begin{matrix}{\int_{0}^{x_{i}} p(x)\, dx = \int_{0}^{x_{i}^{\prime}} q(x)\, dx = \frac{N_{pix}}{x_{\max}} \int_{0}^{x_{i}^{\prime}} dx} & (9)\end{matrix}$

where $p(x)$ and $q(x)$ are the histograms of the original and transformed images respectively. Evaluating the right-hand integral we obtain:

$\begin{matrix}{\frac{x_{\max}}{N_{pix}} \int_{0}^{x_{i}} p(x)\, dx = x_{i}^{\prime}} & (10)\end{matrix}$

Equation (10) tells us that to histogram equalise an image we transform pixel values such that a value $x_{i}$ in the original image is replaced by the proportion of pixels in the original image which are less than or equal to $x_{i}$, scaled by $x_{\max}$. A comparison of Equation (8) and Equation (10) reveals that, disregarding notation, the two are the same, so that our invariant image is obtained by simply histogram equalising each of the channels of our original image.

In the context of image enhancement it is argued [20] that applying an equalisation to the channels of a colour image separately is inappropriate since this can produce significant colour shifts in the transformed image. However, in our context we are interested not in the visual quality of the image but in obtaining a representation which is illuminant and/or device invariant. Histogram equalisation achieves just this provided that the rank ordering of sensor responses is itself invariant to such changes. In addition, by applying histogram equalisation to each of the colour channels we maximise the entropy in each of those channels. This seems desirable since our intent in computer vision is to use the representation to extract information about the scene, and thus maximising the information content of our scene representation ought to be helpful in itself.

4 INVARIANCE OF RANK ORDERING

The analysis above shows that the illuminant invariance of histogram equalised images follows directly from the assumption of a diagonal model of illumination change (Equation (2)). But the method does not require Equation (2) to hold to provide invariance. Rather, we require only that rank orderings of sensor responses remain (approximately) invariant under a change in illumination. In fact, the method is not restricted to a change in lighting but applies to any transformation of the image which leaves rank orderings unchanged. Consider for example Equation (4), which allows the image formation process to include an arbitrary non-linear transform (denoted by f( )) of sensor responses. Different transforms f( ) lead of course to different images. But note that the rank ordering of sensor responses will be preserved provided that f( ) is a monotonic increasing function. Thus, histogram equalised images are invariant to monotonic increasing functions. This fact is important because many of the transformations which are applied to images, such as gamma or tone-curve corrections, satisfy the condition of monotonicity.
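The role of monotonicity can be illustrated with two different gamma encodings of the same linear responses. This is a minimal sketch only; the gamma values 2.2 and 1.8, the sample responses and the function names are assumptions, and the "folded" transform is a made-up counter-example.

```python
def ordering(values):
    """Indices of `values` sorted from the smallest to the largest response."""
    return sorted(range(len(values)), key=lambda i: values[i])


def gamma_encode(values, gamma):
    """Power-law encoding with exponent 1/gamma, for values in [0, 1]."""
    return [v ** (1.0 / gamma) for v in values]


linear = [0.02, 0.75, 0.30, 0.51, 0.08]    # hypothetical linear sensor responses
device_a = gamma_encode(linear, 2.2)       # monotonic increasing transform
device_b = gamma_encode(linear, 1.8)       # a different monotonic transform
folded = [abs(v - 0.5) for v in linear]    # not monotonic: ordering is scrambled

monotonic_preserves = ordering(device_a) == ordering(device_b) == ordering(linear)
non_monotonic_preserves = ordering(folded) == ordering(linear)
```

The two gamma-encoded versions differ in their pixel values but share one ordering, so per-channel histogram equalisation would map them to the same image; the folded version does not.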

To investigate further the rank invariance of sensor responses we conducted a similar experiment to that of Dannemiller [2], who investigated to what extent the responses of cone cells in the human eye maintain their rank ordering under a change in illumination. He found that to a very good approximation rank orderings were maintained. Here, we extend the analysis to investigate a range of different devices in addition to a range of illuminants. To investigate the invariance of rank orderings of sensor responses for a single device under changing illumination we proceed as follows. Let $R_{k}$ represent the spectral sensitivity of the kth sensor of the device we wish to investigate. Now suppose we calculate (according to Equation (1)) the responses of this sensor to a set of surface reflectance functions under a fixed illuminant $E^{1}(\lambda)$. We denote those responses by the vector $P_{k}^{1}$. Similarly we denote by $P_{k}^{2}$ the responses of the same sensor to the same surfaces viewed under a second illuminant $E^{2}(\lambda)$. Next we define a function rank( ) which takes a vector argument and returns a vector whose elements contain the rank of the corresponding element in the argument. Then, if the rank ordering of sensor responses is invariant to the illuminants $E^{1}$ and $E^{2}$, the following relationship must hold:

$\begin{matrix}{\mathrm{rank}(P_{k}^{1}) = \mathrm{rank}(P_{k}^{2})} & (11)\end{matrix}$

In practice the relationship in Equation (11) will hold only approximately and we can assess how well the relationship holds using Spearman's Rank Correlation coefficient which is given by:

$\begin{matrix}{\rho = 1 - \frac{6 \sum_{j = 1}^{N_{s}} d_{j}^{2}}{N_{s}\left(N_{s}^{2} - 1\right)}} & (12)\end{matrix}$

where $d_{j}$ is the difference between the jth elements of $\mathrm{rank}(P_{k}^{1})$ and $\mathrm{rank}(P_{k}^{2})$ and $N_{s}$ is the number of surfaces. This coefficient takes a value between −1 and 1: a coefficient of zero implies that the two rank orderings are uncorrelated, so that Equation (11) holds not at all, while a value of one will be obtained when the relationship is exact. Invariance of rank ordering across devices can be assessed in a similar way by defining two vectors: $P_{k}^{1}$ defined as above and $Q_{k}^{1}$ representing sensor responses of the kth class of sensor of a second device under the illuminant $E^{1}$. By substituting these vectors in Equation (12) we can measure the degree of rank correlation. Finally we can investigate rank order invariance across device and illumination together by comparing, for example, the vectors $P_{k}^{2}$ and $Q_{k}^{1}$.
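Equation (12) can be computed in a few lines. The following is a minimal sketch which assumes that no two responses in a vector are tied (the simple form of the coefficient does not account for ties); the function names and the response values are hypothetical.

```python
def ranks(values):
    """Rank of each element (1 = smallest); assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(p1, p2):
    """Spearman's rank correlation between two response vectors (Equation (12))."""
    n_s = len(p1)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(p1), ranks(p2)))
    return 1.0 - 6.0 * d_squared / (n_s * (n_s ** 2 - 1))


# Hypothetical responses of one sensor to five surfaces under two illuminants.
p_1 = [0.10, 0.40, 0.25, 0.80, 0.55]
p_2 = [0.12, 0.35, 0.30, 0.70, 0.72]   # the two brightest surfaces swap order

rho = spearman(p_1, p_2)               # a single swap of adjacent ranks gives 0.9
```

For real data with tied responses, a library routine that handles ties would be the safer choice.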

We conducted such an analysis for a variety of imaging devices and illuminants, taking as our surfaces a set of 462 Munsell chips [21] which represent a wide range of reflectances that might occur in the world. For illuminants we chose 16 different lights, including a range of daylight illuminants, Planckian blackbody radiators, and fluorescent lights, again representing a range of lights which we will meet in the world. Finally, we used the spectral sensitivities of the human colour matching functions [21] as well as those of four digital still cameras and a flatbed scanner.

TABLE 1. Spearman's Rank Correlation Coefficient for each sensor of a range of devices. Results are averaged over all pairs of a set of 16 illuminants.

                               1st Sensor   2nd Sensor   3rd Sensor
Across Illumination
  Colour Matching Functions      0.9957       0.9922       0.9992
  Camera 1                       0.9983       0.9984       0.9974
  Camera 2                       0.9978       0.9938       0.9933
  Camera 3                       0.9979       0.9984       0.9972
  Camera 4                       0.9981       0.9991       0.9994
  Scanner                        0.9975       0.9989       0.9995
Across Devices (by illuminant)
  Daylight (D65)                 0.9877       0.9934       0.9831
  Fluorescent (cwf)              0.9931       0.9900       0.9710
  Tungsten (A)                   0.9936       0.9814       0.9640
Across Device and Illuminant     0.9901       0.9886       0.9774

Table 1 summarises the results using the measure defined by Equation (12). Values are shown for each device with the sensor fixed and the illumination allowed to change (averaged over the 16 illuminants in our illuminant set), for three illuminants (daylight, fluorescent, and tungsten) averaged over all devices, and for all devices and illuminants together. In all cases the results show a very high degree of correlation: average correlation never falls below 0.964. Minimum correlation over all devices and illuminants was 0.9303 for the 1st sensor, 0.9206 for the 2nd sensor and 0.8525 for the 3rd. Thus on the basis of these results we conclude that rank orderings are preserved to a very good approximation across a change in either or both of device and illumination.

5 AN APPLICATION TO COLOUR INDEXING

To test the invariance properties of histogram equalisation further we applied the method to an image retrieval task. Finlayson et al. [3] recently investigated whether existing invariant approaches were able to facilitate good enough image indexing across a change in either or both of illumination and device. Their results suggested that the answer to this question was no. Here we repeat their experiment but using histogram equalised images as our basis for indexing to investigate what improvement, if any, the method brings.

TABLE 2. Average Match Percentile results of the indexing experiment for four different cases: (1) Across illumination, (2) Across cameras, (3) Across scanners, and (4) Across all devices.

Colour Model   Case (1)   Case (2)   Case (3)   Case (4)
RGB             63.23      71.85      98.88      65.53
Greyworld       93.96      94.22      99.34      92.28
Hist Eq.        96.72      95.52      98.94      94.54

The experiment is based on a database of images of coloured textures captured under a range of illuminants and devices and described in [4]. In summary there are 28 different coloured textures each captured by six different devices (4 cameras and 2 scanners). In addition each camera was used to capture each of the textures under 3 different lights so that in total there are (3×4+2)×28=392 images. In image indexing terms this is a relatively small database and it is chosen because it allows us to investigate performance across a change in illumination and device. In our experiments we tested indexing performance across three different conditions: across illumination (Case 1 in Table 2), across homogeneous devices (Cases 2 and 3), and across heterogeneous devices (Case 4). In each case the experimental procedure was as follows. First, we choose a set of 28 images all captured under the same conditions (same device and illuminant) to be our image database. Next we select from the remaining set of images a subset of appropriate query images. So, if we are testing performance across illumination, we select as our query images the 56 images captured by the device corresponding to the database images, under the two non-database illuminants. Then for all database and query images we derive an invariant image using either the histogram equalisation method set forth above, or one of a range of previously published [9, 5, 6, 10, 18] invariant methods. Finally we represent the invariant image by its colour distribution: that is, by a histogram of the pixel values in the invariant image. All results reported here are based on 3-dimensional histograms of dimension 16×16×16.
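A 16×16×16 colour histogram of the kind described above can be built by quantising each channel into 16 bins and counting pixels per bin triplet. This is a minimal sketch; the function name, the sparse dictionary layout, the value range of 0 to 255 and the normalisation to proportions are assumptions made for illustration rather than details taken from the experiment.

```python
from collections import Counter


def colour_histogram(pixels, bins=16, max_value=255.0):
    """Proportion of pixels in each (r_bin, g_bin, b_bin) cell of a bins^3 grid.

    Only non-empty cells are stored. Values equal to max_value are placed
    in the top bin.
    """
    counts = Counter()
    for pixel in pixels:
        key = tuple(min(int(v / max_value * bins), bins - 1) for v in pixel)
        counts[key] += 1
    total = len(pixels)
    return {key: n / total for key, n in counts.items()}


# Hypothetical four-pixel image.
example = [(0, 0, 0), (255, 255, 255), (255, 255, 255), (128, 64, 200)]
hist = colour_histogram(example)
```

Storing proportions rather than raw counts keeps histograms of images with different pixel counts directly comparable.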

Indexing is performed for a query image by comparing its histogram to each of the histograms of the database images. The database image whose histogram is most similar to the query histogram is retrieved as a match to the query image. We compare histograms using the intersection method described by Swain et al. [19] which we found to give the best results on average. Indexing performance is measured using average match percentile [19] which gives a value between 0 and 100%. A value of 99% implies that the correct image is ranked amongst the top 1% of images whilst a value of 50% corresponds to the performance we would achieve using random matching.
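The matching step can be sketched as follows. Histograms are assumed to be dictionaries of proportions summing to one, so that intersection reduces to a sum of bin-wise minima, and the match percentile is assumed to take the form (N − r)/(N − 1), with r the rank of the correct image among N database images. Both the function names and that formula are assumptions about the cited measure, not details confirmed by the text.

```python
def intersection(query, model):
    """Histogram intersection of two normalised histograms stored as dicts."""
    return sum(min(value, model.get(key, 0.0)) for key, value in query.items())


def match_percentile(query, database, correct_name):
    """Percentile rank (0 to 100) of the correct database image for one query.

    Assumes the form 100 * (N - r) / (N - 1), with r the 1-based rank of the
    correct image when database images are sorted by decreasing intersection.
    """
    scores = {name: intersection(query, hist) for name, hist in database.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    r = ranked.index(correct_name) + 1
    n = len(database)
    return 100.0 * (n - r) / (n - 1)


# Hypothetical three-image database and one query.
database = {
    "brick": {(0, 0, 0): 0.5, (1, 1, 1): 0.5},
    "grass": {(0, 5, 0): 1.0},
    "sand": {(9, 8, 3): 0.7, (0, 0, 0): 0.3},
}
query = {(0, 0, 0): 0.6, (1, 1, 1): 0.4}

best_case = match_percentile(query, database, "brick")    # ranked first
middle_case = match_percentile(query, database, "sand")   # ranked second of three
```

Averaging this value over all query images gives an average match percentile in the sense used in the text, under the assumed formula.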

Table 2 summarises the average match percentile (AMP) results for the four different cases. In addition to results for histogram equalisation we also show results based on histograms of the original images (RGB), and on Greyworld normalised images, that is on images calculated according to Equation (3). Results for a variety of other invariant representations can be found in [3]: all perform significantly worse than Greyworld. Significantly, histogram equalisation outperforms Greyworld in every case except matching across scanners (Case 3), where both methods score close to 99%. Histogram equalisation results across a change in illumination are very good: an AMP of close to 97% as compared to 94% for the second best method. Results of matching across homogeneous devices (Cases 2 and 3 in Table 2) show that Greyworld and histogram equalisation perform similarly, with histogram equalisation performing slightly better on average. Finally, across heterogeneous devices histogram equalisation performs best.

While the results are quite good (the histogram equalisation method clearly outperforms all other methods on average), the experiment does raise a number of issues. First, it is surprising that one of the simplest invariants, Greyworld, performs as well as it does, whilst other more sophisticated invariants perform very poorly. This performance indicates that for this dataset a diagonal scaling of sensor responses accounts for most of the change that occurs when illuminant or device is changed. It also suggests that any non-linear transform applied to the device responses post-capture (the function f( ) in Equation (2)) must be very similar for all devices: most likely a simple power function is applied. Secondly, we might have expected that histogram equalisation would have performed somewhat better than it does. In particular, while illumination invariance is very good, device invariance is somewhat less than we might have hoped for given the analysis in Section 4. An investigation of images for which indexing performance was poor reveals a number of artefacts of the capture process which might account for the performance. First, a number of images captured under tungsten illumination have values of zero in the blue channel for many pixels. Second, a number of the textures have uniform backgrounds but the scanning process introduces significant non-uniformities in these regions. In both cases the resulting histogram equalised images are far from invariant. Excluding these images leads to a significant improvement in indexing performance. However, for an invariant image representation to be of practical use in an uncalibrated environment it must be robust to the limitations of the imaging process. Thus we have reported results including all images.
We stress again, in summary, that the simple technique of histogram equalisation, posited only on the invariance of rank ordering across illumination and/or device, outperforms all previous invariant methods and in particular gives excellent performance across changes in illumination.
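The two observations above (invariance under a monotonic device transform, and its failure when many values are clipped to zero) can be illustrated with a small sketch. It assumes that equalising a channel means replacing each value by the proportion of values in that channel that are less than or equal to it. The helper names, the gains and the exponent are hypothetical illustration values, not parameters of any real device.

```python
from bisect import bisect_right


def equalise_channel(values):
    """Replace each value by the fraction of channel values <= it."""
    ordered = sorted(values)
    n = len(ordered)
    return [bisect_right(ordered, v) / n for v in values]


def equalise_image(pixels):
    """Equalise each colour channel of a list of (R, G, B) tuples independently."""
    channels = zip(*pixels)
    return list(zip(*(equalise_channel(ch) for ch in channels)))


def simulate_device(pixels, gains, gamma):
    """Hypothetical device change: per-channel (diagonal) gain, then a power function."""
    return [tuple((g * v) ** gamma for g, v in zip(gains, p)) for p in pixels]


pixels = [(10, 200, 30), (50, 100, 5), (90, 150, 60), (20, 50, 0)]
other = simulate_device(pixels, gains=(0.8, 1.1, 1.3), gamma=0.45)

# Rank order is preserved in every channel, so the representations agree.
same = equalise_image(pixels) == equalise_image(other)

# Clipping to zero creates ties, which changes the equalised values.
unclipped = equalise_channel([10, 20, 30, 40])
clipped = equalise_channel([0, 0, 0, 5])
```

In this sketch the scaled and power-transformed image yields exactly the same equalised representation as the original, whereas the channel in which three of four values have collapsed to zero does not: the tied pixels all receive the same equalised value, so the representation is no longer invariant.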

REFERENCES

-   [1] J. Bach, C. Fuller, A. Gupta, A. Hampapur, B. Horowitz, R. Humphrey, and R. Jain. The virage image search engine: An open framework for image management. In SPIE Conf. on Storage and Retrieval for Image and Video Databases, volume 2670, pages 76-87, 1996.
-   [2] James L. Dannemiller. Rank orderings of photoreceptor photon catches from natural objects are nearly illuminant-invariant. Vision Research, 33(1):131-140, 1993.
-   [3] G. Finlayson and G. Schaefer. Colour indexing across devices and viewing conditions. In 2nd International Workshop on Content-Based Multimedia Indexing, 2001.
-   [4] G. Finlayson, G. Schaefer, and G. Y. Tian. The UEA uncalibrated colour image database. Technical Report SYS-COO-07, School of Information Systems, University of East Anglia, Norwich, United Kingdom, 2000.
-   [5] G. D. Finlayson, S. S. Chatterjee, and B. V. Funt. Color angular indexing. In The Fourth European Conference on Computer Vision (Vol II), pages 16-27. European Vision Society, 1996.
-   [6] G. D. Finlayson, B. Schiele, and J. Crowley. Comprehensive colour image normalization. In ECCV '98, pages 475-490, 1998.
-   [7] Graham Finlayson. Coefficient Colour Constancy. PhD thesis, Simon Fraser University, 1995.
-   [8] Brian Funt, Kobus Barnard, and Lindsay Martin. Is machine colour constancy good enough? In 5th European Conference on Computer Vision, pages 455-459. Springer, June 1998.
-   [9] Brian V. Funt and Graham D. Finlayson. Color Constant Color Indexing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(5):522-529, 1995.
-   [10] T. Gevers and A. W. M. Smeulders. Color based object recognition. Pattern Recognition, 32:453-464, 1999.
-   [11] Graham D. Finlayson, Steven Hordley, and Paul Hubel. Illuminant estimation for object recognition. Color Research and Application, 2002. To appear.
-   [12] G. Healey. Using color for geometry-insensitive segmentation. Journal of the Optical Society of America A, 6:920-937, 1989.
-   [13] B. A. Maxwell and S. A. Shafer. A framework for segmentation using physical models of image formation. In Computer Vision and Pattern Recognition, pages 361-368. IEEE, 1994.
-   [14] W. Niblack and R. Barber. The QBIC project: Querying images by content using color, texture and shape. In Storage and Retrieval for Image and Video Databases I, volume 1908 of SPIE Proceedings Series, 1993.
-   [15] C. Poynton. The rehabilitation of gamma. In SPIE Conf. Human Vision and Electronic Imaging III, volume 3299, pages 232-249, 1998.
-   [16] Yogesh Raja, Stephen J. McKenna, and Shaogang Gong. Colour model selection and adaptation in dynamic scenes. In 5th European Conference on Computer Vision, pages 460-474. Springer, June 1998.
-   [17] B. Schiele and A. Waibel. Gaze tracking based on face-color. In International Workshop on Automatic Face- and Gesture-Recognition, June 1995.
-   [18] M. Stricker and M. Orengo. Similarity of color images. In SPIE Conf. on Storage and Retrieval for Image and Video Databases III, volume 2420, pages 381-392, 1995.
-   [19] Michael J. Swain and Dana H. Ballard. Color Indexing. International Journal of Computer Vision, 7(1):11-32, 1991.
-   [20] Alan Watt and Fabio Policarpo. The Computer Image. Addison-Wesley, 1997.
-   [21] G. Wyszecki and W. S. Stiles. Color Science: Concepts and Methods, Quantitative Data and Formulas. New York: Wiley, 2nd edition, 1982.

1-15. (canceled)
 16. A method for representing a colour image having values in one or more colour channels for each of a plurality of points within the image, comprising the step of rank ordering the values in at least one of the colour channels.
 17. A method according to claim 16, further comprising the step of generating a histogram of the values in the or each colour channel.
 18. A method according to claim 17, further comprising the step of equalising the or each histogram.
 19. A method according to claim 17, further comprising the step of normalising the or each histogram.
 20. A method according to claim 16, in which each point within the image is a pixel.
 21. A method according to claim 16, comprising the step of, at each of the plurality of points within the image, replacing the value in the or each colour channel with a value corresponding to its rank.
 22. A method according to claim 16, in which the colour channels are RGB channels.
 23. A method according to claim 16, in which the plurality of points within the image represents a part of the image.
 24. A method according to claim 23, in which the part of the image comprises proximate points or pixels.
 25. A method according to claim 16, comprising the step of using the colour image representation method in an automated-vision or computer-vision task, such as image segmentation, image retrieval, image indexing, image or object recognition or object tracking.
 26. An image processing apparatus for carrying out a method for representing a colour image as defined in claim 16.
 27. A computer programmed to carry out a method for representing a colour image as defined in claim 16.
 28. A computer-readable medium carrying a program for causing a general purpose computer to carry out a method for representing a colour image as defined in claim 16.
 29. An image comprising values in each of one or more colour channels for each of a plurality of points in the image, generated by transforming an original image using a method as defined in claim 16.